[RFC] Define Flux tenancy models #2086
rfcs/0001-multi-tenancy/README.md
Outdated
> The platform admins have unrestricted, cluster-scoped access to Kubernetes API.
> They are responsible for installing Flux and granting Flux
> access to the sources (Git, Helm, OCI repositories) that make up the cluster(s) control plane desired state.
Does this mean that tenants should not configure their own sources? In the tenants section it does however state "Register their sources with Flux". I might just be misinterpreting the meaning.
This is about the cluster control plane desired state, as in cluster-wide resources, controllers, etc.
I've added a new section "Tenants Onboarding". Hopefully this clarifies that tenants can add their app repos to their main repo which is registered by admins.
I was initially confused by this as well. It might pay to spend some paras up front explaining which git repositories are assumed to exist, and how they are used (i.e., what they contain).
I wonder if it would make sense to introduce a definition of the "control plane" in a separate paragraph at the beginning of the RFC somewhere. I'm thinking of everything that is either shared among tenants or created as part of the on-boarding of a tenant; the Flux instance itself, components such as Gatekeeper and resources such as ServiceAccounts.
It might also be helpful to explain the repo hierarchy: Each tenant has a root repo that's created by cluster admins and as many subsequent repos maintained by themselves.
rfcs/0001-multi-tenancy/README.md
Outdated
> Example of operations performed by tenants:
>
> - Register their sources with Flux (`GitRepositories`, `HelmRepositories` and `Buckets`).
It is said above that cluster admins `Onboard tenants by registering their Git repositories with Flux`. This might need clarification on the separation of concerns.
I've added a new section "Tenants Onboarding". Hopefully this clarifies that tenants can add their app repos to their main repo which is registered by admins.
This is a great overview of the current state-of-the-art with lots of good references for follow-up education. 👍 LGTM with changes, a few typos corrected.
It's not uncommon to have a "memorandum" RFC which describes the status quo, rather than proposing a new design. It seems like a needless indirection to use an RFC to propose new documentation, giving the content verbatim, though. I would expect either
Given the goal of building up to new designs, I think the first is the appropriate form here (and would require only a little adaptation).
LGTM 🚀
Looks good! (I'm with the MS/Azure group)
Overall comment: 💯 👏 for the effort to definitively set out tenancy models for Flux. I think the content could be more pointed in how it does that, by
- being clear about which bits are definitions or assumptions;
- describing the models more directly -- in some places, the text lapses into being "how to" rather than being definitive.
rfcs/0001-multi-tenancy/README.md
Outdated
> - List the tenancy models supported by Flux.
> - Explain the differences between tenancy models.
These aren't the only ways to set up a multi-tenant Flux system though, are they? This feels like it's partly a guide to good practice, rather than a reference. In which case, the language could be more like
- Define two models for multi-tenancy, "soft multi-tenancy" and "hard multi-tenancy"
- Explain when each is appropriate
- Describe a reference implementation of each model with Flux
(this distinguishes between definitions, which are normative; and implementations, of which there may be variations).
rfcs/0001-multi-tenancy/README.md
Outdated
> ### Hard Multi-Tenancy
>
> With hard multi-tenancy, the platform admins use Kubernetes Cluster API to create dedicated clusters for each tenant.
Is this a strict requirement of the model? Or could the `kubeConfig` secrets come from some other mechanism, e.g., if clusters are constructed with Terraform, or by clicking buttons?
I'm not clear on whether applying things remotely is required for the hard multi-tenancy, or kind of a mixed-in concern (if you're giving each tenant a cluster, you probably have a management cluster, so let's base that model on that assumption ...). Could you provide some justification in the text for this approach? Or explicitly give it as an assumption.
D2iQ actually currently implements hard multi-tenancy without `kubeConfig`; instead, we have controllers install Flux and create the sync resources on each tenant cluster. So I suppose `kubeConfig` is one of several ways to enforce hard multi-tenancy.
rfcs/0001-multi-tenancy/README.md
Outdated
> When onboarding tenants, platform admins have the option to assign namespaces, set
> permissions and register the tenants main repositories onto clusters in a declarative manner.
>
> The Flux CLI offers an easy way of generating all the Kubernetes manifests needed to onboard tenants:
This and the examples following are "how to set up multi-tenancy" rather than describing the model or implementation. Demonstrating how to set it up is not a goal, in the text as it stands -- neither are describing the model or its implementation, but according to the PR title, perhaps they should be.
I suggest reworking this section to describe what the soft-tenancy model requires of RBAC (things like "each tenant namespace has a service account, with these bindings"). Telling people how to make it so conveniently, as you have here, is useful as extra information, but informative rather than definitive.
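To make the suggestion above concrete, here is a minimal sketch, in Python, of what "each tenant namespace has a service account, with these bindings" could look like when written out as Kubernetes manifests. The tenant and namespace names, the label key `example.com/tenant`, the `-reconciler` suffix and the choice of the built-in `admin` ClusterRole are all assumptions made for illustration; this is not the output of the Flux CLI, and the RFC may settle on different bindings.

```python
import json


def tenant_rbac_manifests(tenant, namespace, cluster_role="admin"):
    """Build Namespace, ServiceAccount and RoleBinding manifests for one tenant.

    All names here are hypothetical. The RoleBinding refers to a ClusterRole,
    but because it is a RoleBinding (not a ClusterRoleBinding) the permissions
    it grants apply only inside `namespace`.
    """
    labels = {"example.com/tenant": tenant}  # hypothetical label key
    return [
        {
            "apiVersion": "v1",
            "kind": "Namespace",
            "metadata": {"name": namespace, "labels": labels},
        },
        {
            "apiVersion": "v1",
            "kind": "ServiceAccount",
            "metadata": {"name": tenant, "namespace": namespace, "labels": labels},
        },
        {
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            "metadata": {
                "name": f"{tenant}-reconciler",  # hypothetical naming scheme
                "namespace": namespace,
                "labels": labels,
            },
            "roleRef": {
                "apiGroup": "rbac.authorization.k8s.io",
                "kind": "ClusterRole",
                "name": cluster_role,
            },
            "subjects": [
                {"kind": "ServiceAccount", "name": tenant, "namespace": namespace}
            ],
        },
    ]


if __name__ == "__main__":
    # Print the manifests as JSON, which the Kubernetes API accepts as well as YAML.
    print(json.dumps(tenant_rbac_manifests("tenant-alpha", "flux-tenant-alpha"), indent=2))
```

Stating the model as "these three objects exist per tenant namespace" keeps the definition separate from whichever tool generates them.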
Expanding the RBAC recommendations here would be really useful.
It would be good to ensure we cater for protecting the tenant's service account from being misused.
Here are some ideas:
A) Vanilla K8S
The Platform Admin would pre-create all namespaces the tenant will use ahead of time, setting access via rolebindings for all the tenant's namespaces.
All Flux objects are created at the tenant flux namespace.
flux-tenant-alpha
├── flux-tenant-alpha-serviceaccount
├── flux-tenant-alpha-rolebinding
├── podinfo-helmrepository
└── podinfo-helmrelease
apps
├── flux-tenant-alpha-rolebinding
└── podinfo
B) HNC
The Platform Admin pre-creates the tenant top level namespace, with its service account and rolebindings.
All Flux objects are created at the tenant top level namespace.
Tenants can create subnamespaces and deploy apps to them.
flux-tenant-alpha
├── flux-tenant-alpha-serviceaccount
├── flux-tenant-alpha-rolebinding
├── podinfo-helmrepository
├── podinfo-helmrelease
└── [ns] apps
    ├── flux-tenant-alpha-rolebinding
    └── podinfo
In both cases, the "deployment" service account is never placed on a namespace that is shared with other applications.
If a tenant's flux namespace needs to have mixed use (shared between applications and flux components), it would require admission controllers to block the misuse of the tenant's service account.
C) Vanilla K8S + Admission Controllers
flux-tenant-alpha
├── flux-tenant-alpha-serviceaccount (i.e. kyverno policy to block misuse of this service account)
├── flux-tenant-alpha-rolebinding
├── podinfo-helmrepository
└── podinfo-helmrelease
└── podinfo
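The rule behind the three layouts above (the deployment service account should not share a namespace with application workloads unless an admission policy guards it) can be sketched as a small check. This is only an illustration under stated assumptions: the `Namespace` data model, the `guarded_by_admission_policy` flag and the function name are hypothetical, and a real audit would query the cluster rather than a hand-written list.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Hypothetical, simplified view of a namespace for this sketch."""
    name: str
    service_accounts: set = field(default_factory=set)
    workloads: set = field(default_factory=set)
    guarded_by_admission_policy: bool = False


def misuse_risks(namespaces, deploy_sa):
    """Return namespaces where workloads sit next to the deployment
    service account without an admission policy protecting it."""
    return [
        ns.name
        for ns in namespaces
        if deploy_sa in ns.service_accounts
        and ns.workloads
        and not ns.guarded_by_admission_policy
    ]


# Layout A: Flux objects and apps live in separate namespaces.
layout_a = [
    Namespace("flux-tenant-alpha", {"flux-tenant-alpha"}),
    Namespace("apps", workloads={"podinfo"}),
]

# Mixed use without any admission policy: flagged.
mixed = [Namespace("flux-tenant-alpha", {"flux-tenant-alpha"}, {"podinfo"})]

# Layout C: mixed use, but an admission policy blocks misuse.
layout_c = [
    Namespace("flux-tenant-alpha", {"flux-tenant-alpha"}, {"podinfo"},
              guarded_by_admission_policy=True)
]

if __name__ == "__main__":
    for label, layout in [("A", layout_a), ("mixed", mixed), ("C", layout_c)]:
        print(label, misuse_risks(layout, "flux-tenant-alpha"))
```

Layouts A and C pass the check, while the unguarded mixed-use namespace is reported, which matches the reasoning in the comment.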
rfcs/0001-multi-tenancy/README.md
Outdated
> - [EKS multi-tenancy best practices](https://aws.github.io/aws-eks-best-practices/security/docs/multitenancy/)
>
> ### Soft Multi-Tenancy
The paras here are a nice, and concise, explanation 💟
+1
rfcs/0001-multi-tenancy/README.md
Outdated
> make use of it without any manual actions. For clusters created by other means than Cluster API, the
> platform team has to create the `kubeConfig` secrets to allow Flux access to the remote clusters.
>
> As of Flux v0.23.0, we don't provide any guidance for cluster admins on how to generate the `kubeConfig` secrets.
The text above says they come from Cluster API.
rfcs/0001-multi-tenancy/README.md
Outdated
> ## Motivation
>
> The documentation [here](https://fluxcd.io/docs/) describes the security model of Flux.
Suggested change:
- The documentation [here](https://fluxcd.io/docs/) describes the security model of Flux.
+ The documentation [here](https://fluxcd.io/docs/security/) describes the security model of Flux.
Isn't this the more concrete page? The main one doesn't mention security.
rfcs/0001-multi-tenancy/README.md
Outdated
> ## Introduction
>
> Flux allows different organizations and/or teams to share the same Kubernetes control plane.
I recall someone (maybe it was @stefanprodan) telling me there shouldn't be multiple instances of Flux running on a single cluster (which could help in isolating tenants). Maybe that notion should be part of this doc as some kind of "official guidance"?
There are configurations in which this is theoretically still a solution, but they must adhere to a set of rules that do not apply in most cases.
rfcs/0001-multi-tenancy/README.md
Outdated
> ## User Roles
>
> The tenancy models assume two types of user: platform admins and tenants.
Suggested change:
- The tenancy models assume two types of user: platform admins and tenants.
+ The existing Flux tenancy models assume two types of user: platform admins and tenants.
Not sure if that's the intention here but I figure a bit of clarification of which tenancy model we're talking about here might be helpful.
rfcs/0001-multi-tenancy/README.md
Outdated
> Note that with hard multi-tenancy, tenants have full access to cluster-wide resources, so they have the option
> to manage Flux independently of platform admins, by deploying a Flux instance on each cluster.
We should mention here that hard multi-tenancy can be combined with soft multi-tenancy to get around this limitation.
rfcs/0001-multi-tenancy/README.md
Outdated
> The Kubernetes tenancy models supported by Flux are: soft multi-tenancy and hard multi-tenancy.
>
> For an overview of the Kubernetes multi-tenant architecture please consult the following documentation:
AKS multi-tenancy doc: https://docs.microsoft.com/azure/aks/operator-best-practices-cluster-isolation
rfcs/0001-multi-tenancy/README.md
Outdated
> ## Tenancy Models
>
> The Kubernetes tenancy models supported by Flux are: soft multi-tenancy and hard multi-tenancy.
Of the four sources below (Kubernetes, GCP, Azure and AWS), only AWS uses the terms `soft` and `hard` for multi-tenancy. It would be useful to expand slightly here to clarify what we mean by them, which may speak to the RFC's goal of "Explain when each model is appropriate.".
Some ideas:
| | Soft Multi-tenancy | Hard Multi-tenancy |
|---|---|---|
| Tenants may share cluster with other tenants | Yes | No |
| Tenants may share cluster with the flux management instance | Yes | No |
| Tenants access to cluster-wide resources | Limited | Unrestricted |
rfcs/0001-multi-tenancy/README.md
Outdated
> Note that with soft multi-tenancy, true tenant isolation requires security measures beyond Kubernetes RBAC.
> Please refer to the Kubernetes [security considerations documentation](https://kubernetes.io/blog/2021/04/15/three-tenancy-models-for-kubernetes/#security-considerations)
> for more details on how to harden shared clusters.
I wonder whether we need a small multi-tenancy security section on its own, as similar points may be valid for hard multi-tenancy - although at a lower level of the stack.
The key point is that Flux supports several multi-tenancy use cases, but the Platform Admin is ultimately responsible for ensuring the correct level of isolation is enforced between tenants, based on their own security requirements.
These were adapted from the multi-tenancy RFC: #2086 Signed-off-by: Michael Bridgen <[email protected]>
Signed-off-by: Stefan Prodan <[email protected]>
The multi-tenancy implementations described rely on impersonation and remote apply; to make this RFC stand by itself, those need to be explained, along with the authorisation model (how Flux "decides" what it's allowed to do). This commit adds a summary of the authorisation model, impersonation, and remote apply, and rejigs the headings a little to make space. Signed-off-by: Michael Bridgen <[email protected]>
Signed-off-by: Stefan Prodan <[email protected]>
This gives a baseline for future changes, e.g., expanding where namespace ACLs are used, switching access control to untrusted-by-default. The "Security considerations" section was adapted from #2086 Signed-off-by: Michael Bridgen <[email protected]>
The main goal of this RFC is to define the Kubernetes tenancy models supported by Flux.
This PR attempts to document the status quo, and should provide clarity on what multi-tenancy capabilities Flux has. It also functions as a base for rewriting the loose proposal in #582 into well-scoped RFCs.